Evaluation of a Potential for Automatic SIMD Parallelization of Embedded Applications

Authors

  • Rashindra Manniesing
  • Ireneusz Karkowski
  • Henk Corporaal
Abstract

This paper investigates the potential for automatic mapping of typical embedded applications to architectures with multimedia instruction set extensions. For this purpose a (pattern matching based) code transformation engine is used. The experiments show that about 85% of the loops which are suitable for Single Instruction Multiple Data (SIMD) parallelization can be automatically recognized and mapped.
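As a rough illustration of what "suitable for SIMD parallelization" means here, the sketch below contrasts an element-wise loop with a version that handles a fixed group of elements per step, the way one packed multimedia instruction would. It is not taken from the paper: the function names, the lane width of 4, and the use of Python lists in place of packed registers are all assumptions made for the example.

```python
LANES = 4  # assumed SIMD width (hypothetical; real widths depend on the target ISA)


def add_scalar(a, b):
    """Reference loop: one addition per iteration, no cross-iteration dependence."""
    out = [0] * len(a)
    for i in range(len(a)):
        out[i] = a[i] + b[i]
    return out


def add_packed(a, b, lanes=LANES):
    """Same computation, restructured so each step applies one operation to `lanes` elements."""
    n = len(a)
    out = [0] * n
    main = n - n % lanes  # largest multiple of `lanes` that fits in n
    for i in range(0, main, lanes):
        # Stands in for a single packed add: identical operation on every lane.
        out[i:i + lanes] = [a[i + k] + b[i + k] for k in range(lanes)]
    for i in range(main, n):
        # Scalar epilogue for the leftover elements.
        out[i] = a[i] + b[i]
    return out
```

The loop qualifies because every iteration performs the same operation on independent data. A loop in which iteration `i` reads a value written by iteration `i - 1` could not be regrouped this way without further analysis.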

Similar resources

Automatic SIMD Parallelization of Embedded Applications Based on Pattern Recognition

This paper investigates the potential for automatic mapping of typical embedded applications to architectures with multimedia instruction set extensions. For this purpose a (pattern matching based) code transformation engine is used, which involves a three-step process of matching, condition checking and replacing of the source code. Experiments with DSP and the MPEG2 encoder benchmarks, show t...

Performance Evaluation of Parallel Simd

A simulator for SIMD type architectures is presented. Starting from an architecture independent algorithm description based on recurrence equations, transformation steps for automatic parallelization, mapping and code generation are outlined. The final pseudo code program together with architecture dependent parameters and execution time tables, are fed into the simulator in order to gain perform...

Experimental Evaluation of Affine Schedules for Matrix Multiplication on the MasPar Architecture

This paper reports an experimental study on the suitability of systolic algorithms scheduling methods to the automatic parallelization of algorithms on SIMD computers. We consider the matrix multiplication on the MasPar MP-1 architecture. We comparatively study different scheduling methods and the blocking of the best resulting algorithms.

Parallel Implementation of Real-Time Block-Matching based Motion Estimation on Embedded Multi-Core Architectures

Considering the strict demands of video-based advanced driver-assistance systems in terms of real-time execution, complex applications are usually realized with dedicated hardware solutions. Indeed, modern vector-accelerated multi-core processors, serving as attractive off-the-shelf components, feature increasing computational performance, while executing flexible and maintainable software code...

Automatic Transformations for Effective Parallel Execution on Intel Many Integrated Core

We demonstrate in this work the potential effectiveness of a source-to-source framework for automatically optimizing a sub-class of affine programs on the Intel Many Integrated Core Architecture. Data locality is achieved through complex and automated loop transformations within the polyhedral framework to enable parallel tiling, and the resulting tiles are processed by an aggressive automatic ...

Publication date: 2007